Towards Modular Development of Typed Unification Grammars
نویسندگان
چکیده
Development of large-scale grammars for natural languages is a complicated endeavor: Grammars are developed collaboratively by teams of linguists, computational linguists, and computer scientists, in a process very similar to the development of large-scale software. Grammars are written in grammatical formalisms that resemble very-high-level programming languages, and are thus very similar to computer programs. Yet grammar engineering is still in its infancy: Few grammar development environments support sophisticated modularized grammar development, in the form of distribution of the grammar development effort, combination of sub-grammars, separate compilation and automatic linkage, information encapsulation, and so forth. This work provides preliminary foundations for modular construction of (typed) unification grammars for natural languages. Much of the information in such formalisms is encoded by the type signature, and we subsequently address the problem through the distribution of the signature among the different modules. We define signature modules and provide operators of module combination. Modules may specify only partial information about the components of the signature and may communicate through parameters, similarly to function calls in programming languages. Our definitions are inspired by methods and techniques of programming language theory and software engineering and are motivated by the actual needs of grammar developers, obtained through a careful examination of existing grammars. We show that our definitions meet these needs by conforming to a detailed set of desiderata. We demonstrate the utility of our definitions by providing a modular design of the HPSG grammar of Pollard and Sag.
منابع مشابه
Partially Specified Signatures: A Vehicle for Grammar Modularity
This work provides the essential foundations for modular construction of (typed) unification grammars for natural languages. Much of the information in such grammars is encoded in the signature, and hence the key is facilitating a modularized development of type signatures. We introduce a definition of signature modules and show how two modules combine. Our definitions are motivated by the actu...
متن کاملA baseline method for compiling typed unification grammars into context free language models
This paper presents a minimal enumerative approach to the problem of compiling typed unification grammars into CFG language models, a prototype implementation and results of experiments in which it was used to compile some non-trivial unification grammars. We argue that enumerative methods are considerably more useful than has been previously believed. Also, the simplicity of enumerative method...
متن کاملAn Open-Source Environment for Compiling Typed Unification Grammars into Speech Recognisers
We present REGULUS, an Open Source environment which compiles typed unification grammars into context free grammar language models compatible with the Nuance Toolkit. The environment includes a large general unification grammar of English and corpus-based tools for creating efficient domainspecific recognisers from it. We will demo applications built using the system, including a speech transla...
متن کاملFlexible Structural Analysis of Near-Meet-Semilattices for Typed Unification-Based Grammar Design
We present a new method for directly working with typed unification grammars in which type unification is not well-defined. This is often the case, as large-scale HPSG grammars now usually have type systems for which many pairs do not have least upper bounds. Our method yields a unification algorithm that compiles quickly and yet is nearly as fast during parsing as one that requires least upper...
متن کاملA Framework for Inductive Learning of Typed-Unification Grammars
LIGHT, the parsing system for typed-unification grammars [3], was recently extended so to allow the automate learning of attribute/feature paths values. Motivated by the logic design of these grammars [2], the learning strategy we adopted is inspired by Inductive Logic Programming [4]; we proceed by searching through hypothesis spaces generated by logic transformations of the input grammar. Two...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
- Computational Linguistics
دوره 37 شماره
صفحات -
تاریخ انتشار 2011